
    Overview of the CLEF-2019 CheckThat! Lab: Automatic Identification and Verification of Claims. Task 2: Evidence and Factuality

    We present an overview of Task 2 of the second edition of the CheckThat! Lab at CLEF 2019. Task 2 asked (A) to rank a given set of Web pages with respect to a check-worthy claim based on their usefulness for fact-checking that claim, (B) to classify these same Web pages according to their degree of usefulness for fact-checking the target claim, (C) to identify useful passages from these pages, and (D) to use the useful pages to predict the claim's factuality. Task 2 at CheckThat! provided a full evaluation framework, consisting of data in Arabic (gathered and annotated from scratch) and evaluation based on normalized discounted cumulative gain (nDCG) for ranking and F1 for classification. Four teams submitted runs. The most successful approach to subtask A used learning-to-rank, while different classifiers were used in the other subtasks. We release to the research community all datasets from the lab, as well as the evaluation scripts, which should enable further research in the important task of evidence-based automatic claim verification.
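
    The ranking subtask is scored with nDCG. As a minimal sketch of the metric (not the lab's actual evaluation script), nDCG@k could be computed from graded usefulness labels like this, in Python:

    import math

    def dcg_at_k(gains, k):
        # Discounted cumulative gain over the top-k ranked items.
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

    def ndcg_at_k(gains, k):
        # Normalize by the DCG of the ideal (descending) ordering.
        ideal = dcg_at_k(sorted(gains, reverse=True), k)
        return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

    # Usefulness labels (2 = very useful, 1 = useful, 0 = not useful) of Web
    # pages in the order one hypothetical system ranked them for a claim:
    print(ndcg_at_k([2, 0, 1, 2, 0], k=5))  # ≈ 0.89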

    Dense vs. Sparse representations for news stream clustering

    The abundance of news being generated on a daily basis has made it hard, if not impossible, to monitor all news developments. Thus, there is an increasing need for accurate tools that can organize the news for easier exploration. Typically, this means clustering the news stream, and then connecting the clusters into story lines. Here, we focus on the clustering step, using a local topic graph and a community detection algorithm. Traditionally, news clustering was done using sparse vector representations with TF–IDF weighting, but more recently dense representations have emerged as a popular alternative. Here, we compare these two representations, as well as combinations thereof. The evaluation results on a standard dataset show a sizeable improvement over the state of the art both for the standard F1 and for a BCubed version thereof, which we argue is more suitable for the task.
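
    A minimal sketch of the BCubed F1 metric the abstract argues for, under the standard BCubed definition for hard clusterings (this is illustrative, not the authors' evaluation code):

    def bcubed_f1(pred, gold):
        # pred[i] and gold[i] are the predicted and gold cluster ids of item i.
        n = len(pred)
        p = r = 0.0
        for i in range(n):
            both = sum(pred[j] == pred[i] and gold[j] == gold[i] for j in range(n))
            p += both / sum(pred[j] == pred[i] for j in range(n))  # per-item precision
            r += both / sum(gold[j] == gold[i] for j in range(n))  # per-item recall
        p, r = p / n, r / n
        return 2 * p * r / (p + r)

    # Three news articles clustered together, one split off; gold says all
    # four belong to the same story:
    print(bcubed_f1([0, 0, 0, 1], [0, 0, 0, 0]))  # ≈ 0.77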

    Prta: A System to Support the Analysis of Propaganda Techniques in the News

    Recent events, such as the 2016 US Presidential Campaign, Brexit, and the COVID-19 "infodemic", have brought into the spotlight the dangers of online disinformation. There has been a lot of research focusing on fact-checking and disinformation detection. However, little attention has been paid to the specific rhetorical and psychological techniques used to convey propaganda messages. Revealing the use of such techniques can help promote media literacy and critical thinking, and eventually contribute to limiting the impact of "fake news" and disinformation campaigns. Prta (Propaganda Persuasion Techniques Analyzer) allows users to explore a regularly crawled collection of articles, highlighting the spans in which propaganda techniques occur, and to compare articles based on their use of such techniques. The system further reports statistics about the use of these techniques, overall and over time, or according to filtering criteria specified by the user based on time interval, keywords, and/or political orientation of the media. Moreover, it allows users to analyze any text or URL through a dedicated interface or via an API. The system is available online: https://www.tanbih.org/prta
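
    The abstract mentions a public API but does not document it, so the client below is purely a hypothetical sketch: the endpoint URL, payload, and response shape are invented placeholders, not the real Prta interface.

    import requests

    ENDPOINT = "https://example.org/prta/analyze"  # placeholder, NOT the real API URL

    def analyze(text):
        # Hypothetical call: send a text, get back flagged propaganda spans,
        # e.g. [{"start": 0, "end": 17, "technique": "loaded_language"}].
        resp = requests.post(ENDPOINT, json={"text": text}, timeout=30)
        resp.raise_for_status()
        return resp.json()["spans"]  # assumed response shape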

    Semantic Sentiment Analysis of Twitter Data

    The Internet and the proliferation of smart mobile devices have changed the way information is created, shared, and spread: microblogs such as Twitter, weblogs such as LiveJournal, social networks such as Facebook, and instant messengers such as Skype and WhatsApp are now commonly used to share thoughts and opinions about anything in the surrounding world. This has resulted in the proliferation of social media content, thus creating new opportunities to study public opinion at a scale that was never possible before. Naturally, this abundance of data has quickly attracted business and research interest from various fields, including marketing, political science, and social studies, among many others, which are interested in questions like these: Do people like the new Apple Watch? Do Americans support ObamaCare? How do the Scottish feel about Brexit? Answering these questions requires studying the sentiment of the opinions people express in social media, which has given rise to the fast growth of the field of sentiment analysis in social media, with Twitter being especially popular for research due to its scale, representativeness, variety of topics discussed, and ease of public access to its messages. Here we present an overview of work on sentiment analysis on Twitter.
    Comment: Microblog sentiment analysis; Twitter opinion mining; in the Encyclopedia on Social Network Analysis and Mining (ESNAM), second edition, 201
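
    As a toy illustration of the task this overview covers (not a method from the survey itself), a minimal Twitter sentiment classifier with scikit-learn might look like:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy labelled tweets; a real system would train on an annotated corpus.
    tweets = ["I love the new Apple Watch!", "Worst update ever, totally broken.",
              "Great keynote today", "This app keeps crashing, so annoying"]
    labels = ["positive", "negative", "positive", "negative"]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(tweets, labels)
    print(clf.predict(["I love the battery life"]))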

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 2: Factuality

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 2: Factuality. The task asked to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. In terms of data, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and five of them actually submitted runs. The most successful approaches used by the participants relied on the automatic retrieval of evidence from the Web. Similarities and other relationships between the claim and the retrieved documents were used as input to classifiers in order to make a decision. The best-performing official submissions achieved a mean absolute error of 0.705 and 0.658 for the English and for the Arabic test sets, respectively. This leaves plenty of room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in fact-checking.
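
    The abstract notes that the best systems fed claim–document similarities into classifiers. A minimal sketch of such features (illustrative, not a participant's actual system) could be:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def claim_evidence_features(claim, snippets):
        # Max and mean TF-IDF cosine similarity between the claim and
        # retrieved Web snippets, usable as classifier input features.
        vec = TfidfVectorizer().fit([claim] + snippets)
        sims = cosine_similarity(vec.transform([claim]), vec.transform(snippets))[0]
        return [float(sims.max()), float(sims.mean())]

    print(claim_evidence_features(
        "The unemployment rate fell to 4.9 percent",
        ["Unemployment fell to 4.9 percent in January",
         "The senator voted against the bill"]))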

    Thread-level information for comment classification in community question answering

    Community Question Answering (cQA) is a new application of QA in social contexts (e.g., fora). It presents new and interesting challenges and research directions, e.g., exploiting the dependencies between the different comments of a thread to select the best answer for a given question. In this paper, we explored two ways of modeling such dependencies: (i) by designing specific features that look globally at the thread; and (ii) by applying structured prediction models. We trained and evaluated our models on data from SemEval-2015 Task 3 on Answer Selection in cQA. Our experiments show that: (i) the thread-level features consistently improve the performance for a variety of machine learning models, yielding state-of-the-art results; and (ii) the sequential dependencies between the answer labels captured by structured prediction models are not enough to improve the results, indicating that more information is needed in the joint model.
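
    As a sketch of what "thread-level" features might look like (the feature names here are illustrative, not the paper's exact feature set):

    def thread_features(thread, idx):
        # thread[0] is the question; thread[1:] are comments, each a dict
        # with "user" and "text". Returns global features for comment idx.
        asker = thread[0]["user"]
        c = thread[idx]
        return {
            "relative_position": idx / max(len(thread) - 1, 1),
            "is_by_asker": c["user"] == asker,
            "same_user_as_previous_comment": idx > 1 and c["user"] == thread[idx - 1]["user"],
            "relative_length": len(c["text"]) / max(len(x["text"]) for x in thread[1:]),
        }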

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 1: Check-Worthiness. The task asked to predict which claims in a political debate should be prioritized for fact-checking. In particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. We offered the task in both English and Arabic, based on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign. A total of 30 teams registered to participate in the Lab and seven teams actually submitted systems for Task 1. The most successful approaches used by the participants relied on recurrent and multi-layer neural networks, as well as on combinations of distributional representations, on matching claims' vocabulary against lexicons, and on measures of syntactic dependency. The best systems achieved mean average precision of 0.18 and 0.15 on the English and on the Arabic test datasets, respectively. This leaves considerable room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in check-worthiness estimation.
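
    Check-worthiness ranking is scored with mean average precision (MAP); a minimal sketch of the metric (not the lab's scorer):

    def average_precision(ranked_labels):
        # ranked_labels[i] is 1 if the i-th ranked sentence is check-worthy.
        hits, score = 0, 0.0
        for i, rel in enumerate(ranked_labels):
            if rel:
                hits += 1
                score += hits / (i + 1)
        return score / max(sum(ranked_labels), 1)

    def mean_average_precision(runs):
        # One ranked label list per debate/speech.
        return sum(average_precision(r) for r in runs) / len(runs)

    print(mean_average_precision([[1, 0, 0, 1, 0], [0, 1, 0, 0, 0]]))  # 0.625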

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. In its starting year, the lab featured two tasks. Task 1 asked to predict which (potential) claims in a political debate should be prioritized for fact-checking; in particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. Task 2 asked to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. We offered both tasks in English and in Arabic. In terms of data, for both tasks, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and nine of them actually submitted runs. The evaluation results show that the most successful approaches used various neural networks (esp. for Task 1) and evidence retrieval from the Web (esp. for Task 2). We release all datasets, the evaluation scripts, and the submissions by the participants, which should enable further research in both check-worthiness estimation and automatic claim verification.

    Challenging others when posting misinformation: a UK vs. Arab cross-cultural comparison on the perception of negative consequences and injunctive norms

    This study investigates the factors influencing the willingness to challenge misinformation on social media across two cultural contexts, the United Kingdom (UK) and Arab countries. A total of 462 participants completed an online survey (250 UK, 212 Arab). The analysis revealed that three types of negative consequences (relationship cost, negative impact on the person being challenged, and futility), as well as injunctive norms, influence the willingness to challenge misinformation. Cross-cultural comparisons using t-tests showed significant differences between the UK and the Arab countries on all factors except the injunctive norms. Multiple regression analyses identified differences between the UK and Arab participants concerning which of the factors predicted the willingness to challenge misinformation. The findings suggest that participants' self-reported injunctive norms play a significant role in shaping their willingness to engage in corrective actions in both cultural contexts. Moreover, for UK participants, the perceived negative impact on the person being challenged and the injunctive norms were significant predictors, while for the Arab participants, only the perceived relationship costs emerged as a significant predictor. This study has important implications for policymakers and social media platforms in developing culturally sensitive interventions encouraging users to correct misinformation.
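
    As a minimal sketch of the statistical procedure described (with synthetic stand-in data, since the survey responses are not reproduced here):

    import numpy as np
    from scipy import stats
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    # Synthetic stand-ins for one factor (perceived relationship cost) and
    # the outcome (willingness to challenge), per participant group.
    uk_cost = rng.normal(3.0, 1.0, 250)
    arab_cost = rng.normal(3.4, 1.0, 212)

    # Cross-cultural comparison of the factor via an independent-samples t-test.
    t, p = stats.ttest_ind(uk_cost, arab_cost)
    print(f"t = {t:.2f}, p = {p:.4f}")

    # Multiple regression within one group: willingness ~ factors.
    injunctive = rng.normal(3.5, 1.0, 212)
    willingness = 5 - 0.5 * arab_cost + rng.normal(0, 0.5, 212)
    X = sm.add_constant(np.column_stack([arab_cost, injunctive]))
    print(sm.OLS(willingness, X).fit().params)  # intercept and coefficients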

    Why do we not stand up to misinformation? Factors influencing the likelihood of challenging misinformation on social media and the role of demographics

    This study investigates the barriers to challenging others who post misinformation on social media platforms. We conducted a survey amongst U.K. Facebook users (143 (57.2 %) women, 104 (41.6 %) men) to assess the extent to which the barriers to correcting others, as identified in the literature across disciplines, apply to correcting misinformation on social media. We also group the barriers into factors and explore demographic differences amongst them. It has been suggested that users are generally hesitant to challenge misinformation, and indeed we found that most of our participants (58.8 %) were reluctant to do so. We also identified moderating roles of age and gender in the likelihood of challenging misinformation: older people were more likely to challenge misinformation than young adults, while men demonstrated a slightly greater likelihood to challenge than women. The 20 barriers influencing the decision to challenge misinformation were then grouped into four main factors: social concerns, effort/interest considerations, prosocial intents, and content-related factors. We found that, controlling for age and gender, “social concerns” and “effort/interest considerations” have a significant impact on the likelihood to challenge. The four identified factors were then analysed in terms of demographic differences: men ranked “effort/interest considerations” higher than women, while women placed higher importance on “content-related factors”; moreover, older individuals were found to be more resilient to “social concerns”, and the influence of educational background was most prominent in the ranking of “content-related factors”. Our findings provide important insights for the design of future interventions aimed at encouraging the challenging of misinformation on social media platforms, highlighting the need for tailored, demographically sensitive approaches.
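
    Grouping the 20 barrier items into four factors suggests an exploratory factor analysis; a minimal sketch with synthetic data follows (the paper does not publish its analysis code, so this is only an assumed workflow):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(1)
    # Synthetic Likert-scale responses: 247 participants x 20 barrier items.
    responses = rng.integers(1, 6, size=(247, 20)).astype(float)

    # Extract four latent factors, mirroring the four groupings reported
    # (social concerns, effort/interest, prosocial intents, content-related).
    fa = FactorAnalysis(n_components=4, random_state=0)
    fa.fit(responses)
    print(np.round(fa.components_, 2))  # item loadings per factor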